A Novel Technique for Path Completion in Web Usage Mining
Author
Abstract
The World Wide Web is a huge repository of web pages and links. The Web mining field encompasses a wide array of issues, primarily aimed at deriving actionable knowledge from the Web, and includes researchers from information retrieval, database technologies, and artificial intelligence. The Web is growing tremendously, with approximately one million pages added daily. Users’ accesses are recorded in web logs. Most data used for mining is collected from Web servers, clients, proxy servers, or server databases, all of which generate noisy data. Because Web mining is sensitive to noise, data cleaning methods are necessary. Web usage mining consists of three phases: preprocessing, pattern discovery, and pattern analysis. Web log data is usually noisy and ambiguous, so data preprocessing is an important step in web usage mining. Data preprocessing includes data cleaning, user identification, session identification, and path completion. The inexact data in web access logs is mainly caused by local caching and proxy servers, which are used to improve performance and minimize network traffic: a request served from a cache never reaches the server and is therefore missing from the log. The proposed method uses a path completion algorithm to preprocess the data. The proposed path completion algorithm efficiently appends the lost information and improves the consistency of access data for further web usage mining calculations.
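The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of one common referrer-based heuristic for path completion, not the authors' method. It assumes each session is already a time-ordered list of `(url, referrer)` pairs; the function name `complete_path` and the sample page names are hypothetical. When a request's referrer is not the page visited immediately before it, the sketch assumes the user pressed the back button through cached pages and re-inserts those pages.

```python
from typing import List, Optional, Tuple

Request = Tuple[str, Optional[str]]  # (requested URL, referrer URL or None)


def complete_path(session: List[Request]) -> List[str]:
    """Rebuild a click path, re-inserting page views hidden by caching.

    Heuristic (an assumption, not the paper's algorithm): if the referrer
    of a request is an earlier page in the path rather than the previous
    page, the user is assumed to have stepped back to it through cached
    pages, which the server never logged.
    """
    path: List[str] = []
    for url, referrer in session:
        if path and referrer is not None and referrer != path[-1] and referrer in path:
            # Index of the most recent visit to the referrer page.
            idx = len(path) - 1 - path[::-1].index(referrer)
            # Pages walked back through, ending at the referrer itself.
            backtrack = path[idx:len(path) - 1][::-1]
            path.extend(backtrack)
        path.append(url)
    return path


# Logged requests: A -> B -> C, then D arrives with referrer A.
logged = [("A", None), ("B", "A"), ("C", "B"), ("D", "A")]
completed = complete_path(logged)
print(completed)
```

In this example the server log contains four requests, while the completed path also contains the two cached back-button views of B and A that precede D. Referrers that never appear in the path (for example, external sites) are left alone.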
Similar resources
User Interest Level Based Preprocessing Algorithms Using Web Usage Mining
Web logs play an important role in understanding user behavior. Several pattern mining techniques were developed to understand the user behavior. A specific kind of preprocessing technique improves the quality and accuracy of the pattern mining algorithms. The existing algorithms perform preprocessing to reduce the size of the log file and to identify the number of unique users...
A Survey on Preprocessing Methods for Web Usage Data
The World Wide Web is a huge repository of web pages and links. It provides an abundance of information for Internet users. The Web is growing tremendously, with approximately one million pages added daily. Users’ accesses are recorded in web logs. Because of the heavy usage of the Web, web log files are growing rapidly and becoming huge. Web data mining is the applicati...
Using Petri Nets to Enhance Web Usage Mining
Precise analysis of the web structure can facilitate data processing and enhance the accuracy of the mining results in the procedure of web usage mining. Many researchers have identified that pageview identification and path completion are of great importance in the result of web usage mining. Currently, there is still a lack of an effective and systematic method to analyze and deal with the tw...
Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems
One of the most famous methods for recommendation is user-based Collaborative Filtering (CF). This system compares the active user’s item ratings with the historical rating records of other users to find similar users, and recommends items that seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...
A Survey of Preprocessing Method for Web Usage Mining Process
The number of web applications is growing substantially, and the number of users of web applications is also increasing rapidly. As the number of users increases, the size of the log file also increases. The information stored in log files cannot be used directly for analysis. Therefore, preprocessing of log files is necessary to improve the quality of the web usage mining process. Preprocessi...